We study nonparametric change-point estimation from indirect 
noisy observations. Focusing on the white noise convolution model, 
we consider two classes of functions that are smooth apart from 
the change-point. We establish lower bounds on the minimax risk 
in estimating the change-point and develop rate optimal estimation 
procedures. The results demonstrate that the best achievable rates 
of convergence are determined both by smoothness of the function 
away from the change-point and by the degree of ill-posedness of the 
convolution operator. Optimality is obtained by introducing a new 
technique that involves, as a key element, detection of zero cross- 
ings of an estimate of the properly smoothed second derivative of the 
underlying function. 

1. Introduction. In this paper we study the problem of change-point
estimation from indirect and noisy observations. Let $f \in L_2(\mathbb{R})$ denote the
unknown function. Consider the white noise model

(1) $dY(x) = (Kf)(x)\,dx + \varepsilon\,dW(x), \quad x \in \mathbb{R},$

where $W(\cdot)$ is the standard two-sided Wiener process on $\mathbb{R}$, $0 < \varepsilon < 1$, and
$K$ is the convolution operator with kernel $K \in L_1(\mathbb{R})$ whose action on a
function $f \in L_2(\mathbb{R})$ is defined by

(2) $(Kf)(x) = \int_{-\infty}^{\infty} K(x-y) f(y)\,dy.$

We assume that $f$ is smooth apart from a jump discontinuity of the first
kind at a point $\theta$ and, without loss of generality, we suppose that $\theta \in [0, 1]$.
The problem is to estimate the change-point $\theta$ based on the observation of
a trajectory of the process $Y(\cdot)$ that satisfies (1).

We study this problem in a minimax framework. Let $\hat\theta$ be an estimator of
$\theta$ based on observation of $Y(\cdot)$ that satisfies (1). We measure the accuracy
of $\hat\theta$ by the maximal risk

$R_\varepsilon[\hat\theta; \mathcal{G}] = \sup_{f \in \mathcal{G}} \{E_f |\hat\theta - \theta|^2\}^{1/2}$

over a class of functions $\mathcal{G}$ that have a single change-point $\theta \in [0, 1]$. Here
$E_f$ denotes the expectation with respect to the probability distribution
generated by the model (1) with given $f$. The minimax risk is defined by

$R^*_\varepsilon[\mathcal{G}] = \inf_{\hat\theta} R_\varepsilon[\hat\theta; \mathcal{G}],$

where the infimum is taken over all possible estimators of $\theta$. An estimator
$\hat\theta$ is called rate optimal on the class $\mathcal{G}$ if it satisfies

$R_\varepsilon[\hat\theta; \mathcal{G}] \asymp R^*_\varepsilon[\mathcal{G}]$ as $\varepsilon \to 0$.

Our aim is to find rate optimal change-point estimators and to establish
asymptotics of minimax risks for some natural classes of functions $\mathcal{G}$ and
operators $K$.

Change-points and singularities are intrinsic features of signals that appear
in a wide variety of applied contexts in economics, medicine and physical
science. For many types of signals, change-points convey important information
about underlying phenomena. For instance, in images, discontinuities
of the intensity function correspond to the location of the contour
of an object that may be particularly important for recognition purposes.
We refer to the volume by Carlstein, Müller and Siegmund [4] for a comprehensive
survey of the area and references. The problem of nonparametric
change-point estimation has been extensively studied in the case where the
observations of $f$ are direct, that is, when $K$ is the identity operator. For
such a model, Korostelev [15] constructed a rate optimal estimator of $\theta$ and
derived the optimal rates of convergence. He showed that the minimax risk
over the class of functions that have a single change-point and satisfy the
Lipschitz condition away from the change-point converges to zero at the
rate $\varepsilon^2$, which is faster than the usual parametric rate. (Here and in what
follows we have in mind a standard correspondence between the Gaussian
white noise model and discrete sample models (cf. [3]), given by the calibration
$\varepsilon = n^{-1/2}$, where $n$ is the sample size. With this calibration, the term
parametric rate refers to convergence with the rate $\varepsilon = n^{-1/2}$.) For further
work on nonparametric estimation of change-points and discontinuous functions
from direct observations, see, for example, [1, 9, 18, 21, 22, 25, 27]
and the references cited therein. On the other hand, nonparametric estimation
of a change-point from indirect observations, that is, for operators $K$
that are not the identity, is much less studied. Furthermore, the literature
contains some contradictory statements with regard to best achievable rates
of convergence and therefore leaves open the question of how to construct
optimal estimators.

An important result in this area is due to Neumann [19], who investigated
the problem of change-point estimation from indirect observations in a density
deconvolution model. He assumes that the observations are $Y_i = X_i + \xi_i$,
$i = 1, \ldots, n$, where $X_i$ are i.i.d. random variables with unknown probability
density $f$ and where $\xi_i$ are i.i.d. random errors, independent of the $X_i$'s,
with known probability density $K$. The problem considered by Neumann
[19] is to estimate the location $\theta$ of a discontinuity jump in $f$, where this
density is assumed to satisfy a Lipschitz condition away from the change-point.
Neumann [19] proved that the order of the minimax risk in estimating
$\theta$ is $\min\{n^{-2/(2\beta+3)}, n^{-1/(2\beta+1)}\}$, provided that the tails of the characteristic
function $\hat K(\omega)$ of $\xi_i$ decrease at the rate $|\omega|^{-\beta}$, $\beta > 0$. In the nonparametric
regression context, Raimondo [21] considered the problem of estimating a
change-point in the $\beta$th derivative of the regression function. Assuming that
this derivative satisfies a Lipschitz condition apart from the change-point $\theta$,
Raimondo [21] claims that the best rate of convergence in estimating $\theta$ is
$n^{-1/(2\beta+1)}$. Estimation procedures that achieve this rate were also proposed
by Wang [26] and, more recently, by Huh and Carriere [13] and Park and
Kim [20]. Clearly, if $K$ is the Green's function of a linear differential operator
of integer order $\beta$, estimating the change-point $\theta$ of $f$ from indirect
observation as in model (1) is equivalent to estimating the change-point in
the derivative of order $\beta$ from direct observations. This fact indicates that
there is a discrepancy between the rates of convergence obtained by Neumann
[19], on the one hand, and by Raimondo [21] and other authors cited
above, on the other hand. In particular, the rates obtained by Neumann [19]
are faster. Although asymptotic equivalence between the two indirect observation
models (the density model as in [19], and regression/white noise
model as in [21]) has not been established formally, it would seem natural
to expect that the rates of convergence are in agreement. In what follows,
we will show that the "faster" rates of Neumann [19] can indeed be attained
and they are optimal for the white noise model (1). This fact will be deduced
from more general results.

We study the problem of change-point estimation in model (1) for two different
scales of functional classes $\mathcal{G}$ that quantify smoothness of $f$ away from
the change-point. We derive lower bounds on the minimax risk (see Theorems
2 and 4) and develop rate optimal estimators (see Theorems 1 and 3).






A. GOLDENSHLUGER, A. TSYBAKOV AND A. ZEEVI 



In particular, we show that if $f$ can be represented as the sum of a jump function
and a smooth function whose $m$th derivative exists and is bounded for
all $x$, then the minimax risk in estimating $\theta$ is of order
$\min\{\varepsilon^{(2m+2)/(2m+2\beta+1)}, \varepsilon^{2/(2\beta+1)}\}$, provided that the tails of the Fourier transform $\hat K$ of $K$ behave
like $|\omega|^{-\beta}$, as $|\omega| \to \infty$, with $\beta > 0$. The elbow in the rates of convergence
corresponds to the cases where $\beta > 1/2$ and $0 < \beta \le 1/2$. If $\beta > 1/2$, the convolution
kernel $K$ belongs to $L_2(\mathbb{R})$. In what follows we call such convolution
kernels and the corresponding setup regular. In contrast, under $0 < \beta \le 1/2$,
the convolution kernel $K$ does not belong to $L_2(\mathbb{R})$. We will call the latter
case singular because it necessarily corresponds to a singular convolution
integral in (2).
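
As a small illustration of the elbow described above, the following sketch computes the exponent $r$ in the rate $\varepsilon^{r}$ for given $m$ and $\beta$, ignoring logarithmic factors. Since $0 < \varepsilon < 1$, the minimum of the two powers of $\varepsilon$ corresponds to the larger of the two exponents. The function names `rate_exponent` and `regime` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction


def rate_exponent(m, beta):
    """Exponent r such that the risk is of order eps**r (log factors ignored).

    For 0 < eps < 1, min{eps**r1, eps**r2} = eps**max(r1, r2).
    """
    m, beta = Fraction(m), Fraction(beta)
    regular = (2 * m + 2) / (2 * m + 2 * beta + 1)   # eps^{(2m+2)/(2m+2beta+1)}
    singular = Fraction(2) / (2 * beta + 1)          # eps^{2/(2beta+1)}
    return max(regular, singular)


def regime(beta):
    """Regular case if beta > 1/2, singular case if 0 < beta <= 1/2."""
    return "regular" if Fraction(beta) > Fraction(1, 2) else "singular"


if __name__ == "__main__":
    for m, beta in [(1, 1), (3, 2), (1, Fraction(1, 4)), (3, Fraction(1, 2))]:
        print(m, beta, regime(beta), rate_exponent(m, beta))
```

The two exponents coincide exactly at $\beta = 1/2$, which is where the regular and singular regimes meet.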

We introduce a new estimation technique that involves, as a key element, 
detection of zero crossings of an estimate of a properly smoothed second 
derivative of $f$. This differs from most change-point detection methods described
in the statistical literature that typically use a properly smoothed
first derivative of $f$. On the other hand, our second derivative based approach
has parallels in digital image processing in the context of edge detection,
where it is often referred to as the Laplacian method (cf. [11]). It
is interesting to note that in the regular case seemingly intuitive procedures 
based on detecting a maximum in the first derivative lead to slower rates of 
convergence (see further discussion in Section 5). 

The optimal rate of convergence in the regular case, $\varepsilon^{(2m+2)/(2m+2\beta+1)}$,
clarifies how smoothness of $f$ away from the change-point (given by the
index $m$) and ill-posedness of the kernel $K$ (given by $\beta$) affect achievable
accuracy in change-point estimation from indirect observations. The result of
Neumann [19] in the density deconvolution model, with standard calibration
$\varepsilon = n^{-1/2}$, can be viewed as the "density analog" of a special case of our
result with $m = 1$, that is, when $f$ is Lipschitz apart from the change-point.
When the "smooth part" of the unknown function $f$ is analytic, our results
show that in the regular case the optimal rate is $\varepsilon$, up to a logarithmic factor
in $\varepsilon^{-1}$, that is, it is nearly the parametric rate. Interestingly, in this case
the ill-posedness index $\beta$ of $K$ appears in the risk bound only as a power of
the logarithmic factor. This means that ill-posedness of $K$ does not affect
significantly the quality of estimation when $f$ is very smooth apart from
the change-point. We also show that in the singular case the optimal rate
of convergence is $\varepsilon^{2/(2\beta+1)}$, up to a logarithmic factor, regardless of the
smoothness of $f$ away from the change-point.

Our results elucidate the following important feature of the problem: when
estimating a change-point from indirect data, the best achievable rates of
convergence depend on the behavior of the function $f$ away from the change-point
location. This is in striking contrast to the direct observations case,
where the rate $\varepsilon^2$ is the best one can achieve regardless of how many derivatives
$f$ possesses apart from the discontinuity jump. Our results also indicate
that the procedure of Raimondo [21] is not optimal when estimating a
change-point of the $\beta$th derivative, $\beta \ge 1$, in the direct observation model,
contrary to what is claimed in that paper (see further discussion in Section 5).
We note that estimating a change-point from indirect observations
can be done with higher accuracy than curve estimation in nonparametric
deconvolution (see, e.g., [5, 6, 7, 10], and [8]).

The rest of the paper is organized as follows. Section 2 introduces nota- 
tion and definitions of the functional classes. In Section 3 we construct a 
probe functional that is used for detection of the change-point from indirect 
observations; some properties of the probe functional are discussed, and its 
estimator is developed. Section 4 describes the two-stage change-point esti- 
mation procedure and presents our main results. Section 5 concludes with a 
discussion of the main results and Section 6 contains the proofs. 

2. Preliminaries. We begin with some notation and definitions. Let $\hat g$ or
$(\mathcal{F}g)$ denote the Fourier transform of a function $g \in L_2(\mathbb{R})$; in particular, if
$g \in L_1(\mathbb{R}) \cap L_2(\mathbb{R})$,

$\hat g(\omega) = (\mathcal{F}g)(\omega) = \int_{-\infty}^{\infty} g(x) e^{2\pi i \omega x}\,dx, \quad \omega \in \mathbb{R}.$

Let $f(x\pm) = \lim_{t \to x\pm} f(t)$ be the one-sided limits of $f$ at point $x$ and let
$[f](x) = f(x+) - f(x-)$ be the local jump function. We say that $\theta \in \mathbb{R}$ is a
change-point of $f$ if $[f](\theta) \ne 0$, and $f(\theta+)$ and $f(\theta-)$ are finite.

We will consider minimax estimation of a change-point $\theta$ of $f$ by assuming
that $f$ belongs to one of the two functional classes, $\mathcal{F}_m$ or $\mathcal{A}_\nu$, defined below.

Definition 1. Let $a, L > 0$ be fixed constants. We say that $f \in \mathcal{F}_1 =
\mathcal{F}_1(a, L)$ if $f \in L_2(\mathbb{R})$ and if $f$ has a single change-point $\theta \in [0, 1]$ such that
$|[f](\theta)| \ge a$ and

$|f(x) - f(x')| \le L|x - x'| \quad \forall x, x' \in \mathbb{R},\ x < x',\ \theta \notin [x, x'].$

The class $\mathcal{F}_1$ contains functions $f$ that have a single jump discontinuity of
the first kind at $\theta \in [0, 1]$ and satisfy the Lipschitz condition on any interval
that does not include $\theta$. This class was considered by Neumann [19] in the
context of density deconvolution. To allow more smoothness of $f$ apart from
the jump discontinuity, we introduce the following extension of $\mathcal{F}_1$.

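Definition 2. Let $a, L > 0$ and $m > 1$ be fixed constants. We say that
$f \in \mathcal{F}_m = \mathcal{F}_m(a, L)$ if $f \in L_2(\mathbb{R})$ and if $f$ has a single change-point $\theta \in [0, 1]$
such that the following conditions hold:

(i) We have $|[f](\theta)| \ge a$.

(ii) For all $x \ne \theta$ the derivative $f'(x)$ exists and $[f'](\theta) = 0$, so that the function $g_f : \mathbb{R} \to
\mathbb{R}$ defined by

(3) $g_f(x) = f'(x)$ for $x \ne \theta$, $\quad g_f(\theta) = f'(\theta+) = f'(\theta-)$,

is continuous.

(iii) The function $g_f$ belongs to $L_2(\mathbb{R})$ and its Fourier transform $\hat g_f$ satisfies

(4) $\int_{-\infty}^{\infty} |\hat g_f(\omega)|\,|\omega|^{m-1}\,d\omega \le L.$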

If $m$ is an integer, condition (4) implies that the derivative $g_f^{(m-1)}$ exists
and is bounded by $L$. In fact, (4) is only slightly stronger than this property.
For example, (4) is valid if $g_f$ is in a Sobolev class of $L_2$ smoothness $s > m -
1/2$, that is, when $\int |\hat g_f(\omega)|^2 |\omega|^{2s}\,d\omega$ is bounded by an appropriate constant.
This class is very close to, but smaller than, the class of functions with
uniformly bounded derivative $g_f^{(m-1)}$.

It is important that in Definition 2 we have $[f'](\theta) = 0$. If $[f'](\theta) \ne 0$, parts
(ii) and (iii) of Definition 2 cannot be satisfied, but we may still consider
classes of functions $f$ that are smooth separately to the left and to the
right of the change-point. However, introducing such classes seems to be
unjustified, because additional smoothness in these terms does not improve
the convergence rate of estimators of $\theta$. The minimax rate remains the same
as for the class $\mathcal{F}_1$.

Definition 3. Let $\nu, a, L > 0$ be fixed constants. We say that $f \in \mathcal{A}_\nu =
\mathcal{A}_\nu(a, L)$ if $f \in L_2(\mathbb{R})$, and if $f$ has a single change-point $\theta \in [0, 1]$ such that
conditions (i) and (ii) of Definition 2 are satisfied and

(5) $\int_{-\infty}^{\infty} |\hat g_f(\omega)|^2 \exp\{2\nu|\omega|\}\,d\omega \le L^2,$

where $g_f$ is defined in (3).

Assumption (5) implies that $g_f$ is infinitely differentiable and admits an
analytic continuation onto a strip in the complex plane. Such classes of
functions have been studied in the context of nonparametric estimation by
many authors, starting with Ibragimov and Hasminskii [14]. For a recent
overview, see, for example, [2].

The following assumption on $K$ will be used throughout this paper.

Assumption K. The function $K$ belongs to $L_1(\mathbb{R})$, and there exist constants
$\beta > 0$ (called the ill-posedness index of $K$) and $\underline{k}, \bar{k} > 0$, such that

(6) $\underline{k}(1 + |\omega|^2)^{-\beta/2} \le |\hat K(\omega)| \le \bar{k}(1 + |\omega|^2)^{-\beta/2} \quad \forall \omega \in \mathbb{R}.$

Assumption K is quite standard in deconvolution problems and corresponds
to what is known as a moderately ill-posed problem. Green's functions
of linear differential operators are important examples of kernels $K$
satisfying Assumption K for the regular case ($\beta > 1/2$). For instance, let
$v(x) = e^{x}$, $-\infty < x < 0$, $v(0) = 1/2$ and $v(x) = 0$, $0 < x < \infty$. For nonvanishing
real constants $b_j$, $j = 1, \ldots, k$, we define

(7) $v_j(x) = |b_j| v(b_j x)$ and $K = v_1 * v_2 * \cdots * v_k,$

where $*$ stands for the convolution on $\mathbb{R}$. The Fourier transform of the kernel
$K$ is given by $\hat K(\omega) = \{\prod_{j=1}^{k}(1 - 2\pi b_j^{-1} i\omega)\}^{-1}$, and Assumption K holds
with $\beta = k$. In this case $f$ can be recovered from $Kf$ by applying the linear
differential operator

(8) $f(\cdot) = \Big\{\prod_{j=1}^{k}\Big(1 - b_j^{-1}\frac{d}{dx}\Big)\Big\}(Kf)(\cdot);$

see [12], Chapter II. As for the singular case ($0 < \beta \le 1/2$), examples are
more peculiar; for instance, one may consider $K$ to be the probability density
of a gamma distribution with shape parameter $\beta$.
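
The claim that the kernel in (7) satisfies Assumption K with $\beta = k$ can be checked numerically. The sketch below evaluates $|\hat K(\omega)|(1+\omega^2)^{k/2}$ on a grid and compares it with explicit constants: each factor $(1+\omega^2)^{1/2}/|1 - 2\pi b_j^{-1} i\omega|$ lies between $\min(1, |b_j|/2\pi)$ and $\max(1, |b_j|/2\pi)$. The particular values of `b`, the grid, and the helper names are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def kernel_ft_modulus(omega, b):
    """|K_hat(omega)| for K = v_1 * ... * v_k with K_hat = prod_j (1 - 2*pi*i*omega/b_j)^{-1}."""
    omega = np.asarray(omega, dtype=float)
    b = np.asarray(b, dtype=float)
    factors = 1.0 - 2j * np.pi * omega[..., None] / b   # shape (..., k)
    return 1.0 / np.abs(np.prod(factors, axis=-1))


def envelope_ratio(omega, b):
    """|K_hat(omega)| * (1 + omega^2)^{k/2}; bounded above and below under (6) with beta = k."""
    omega = np.asarray(omega, dtype=float)
    k = len(b)
    return kernel_ft_modulus(omega, b) * (1.0 + omega**2) ** (k / 2)


# Illustrative constants b_j (assumed values) and frequency grid.
b = [1.0, -2.0, 5.0]
omega = np.linspace(-1e3, 1e3, 20001)
ratio = envelope_ratio(omega, b)

# Explicit candidates for the lower and upper constants in (6).
lower = np.prod([min(1.0, abs(bj) / (2 * np.pi)) for bj in b])
upper = np.prod([max(1.0, abs(bj) / (2 * np.pi)) for bj in b])

if __name__ == "__main__":
    print(ratio.min(), ratio.max(), lower, upper)
```

Only the modulus of $\hat K$ enters (6), so the check does not depend on the sign convention used in the exponent of the Fourier transform.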

3. Probe functional. We will develop estimation procedures that are
based on minimization of an empirical version of a properly chosen probe
functional. Let $\varphi : \mathbb{R} \to \mathbb{R}$ be an even, twice continuously differentiable function
that attains its global maximum at 0. Further conditions on $\varphi$ will
be introduced below. Fix a bandwidth $h > 0$ and for $t \in \mathbb{R}$, $x \in \mathbb{R}$ define
$\psi_t(x) = h^{-3}\varphi''(h^{-1}(x - t))$. Let $\langle\cdot,\cdot\rangle$ denote the standard inner product in
$L_2(\mathbb{R})$. Assuming that $\varphi'' \in L_1(\mathbb{R}) \cap L_2(\mathbb{R})$, we define the probe functional

(9) $\ell_h(t) = \langle f, \psi_t\rangle = \int_{-\infty}^{\infty} f(x)\psi_t(x)\,dx.$

The probe functional $\ell_h(t)$ is thus a smoothed second derivative of $f$ at
point $t$: as $h$ tends to zero, $\ell_h(t)$ converges to $f''(t)$, provided that $f$ is twice
continuously differentiable at $t$. The points $t$ where $|\ell_h(t)|$ is close to zero are
indicative of the change-point location; this idea underlies the construction.

An estimator of $\ell_h(t)$ based on observations (1) can be developed as follows.
Denote by $K^*$ the adjoint operator to $K$ given by

(10) $(K^*g)(x) = \int_{-\infty}^{\infty} K(y - x)g(y)\,dy, \quad g \in L_2(\mathbb{R}).$

By the linear functional strategy, if $\psi_t \in \mathrm{Range}(K^*)$, then there exists a
function $\gamma_t \in L_2(\mathbb{R})$ such that

$\ell_h(t) = \langle f, \psi_t\rangle = \langle f, K^*\gamma_t\rangle = \langle Kf, \gamma_t\rangle.$

The function $\gamma_t$ satisfies $(K^*\gamma_t)(x) = \psi_t(x) = h^{-3}\varphi''(h^{-1}(x - t))$ almost everywhere
in $\mathbb{R}$. Taking the Fourier transforms, and using (10) and the fact
that $\hat\psi_t(\omega) = -(2\pi\omega)^2\hat\varphi(\omega h)e^{2\pi i\omega t}$, we find

$\hat\gamma_t(\omega) = -(2\pi\omega)^2 e^{2\pi i\omega t}\,\frac{\hat\varphi(\omega h)}{\hat K(-\omega)}.$

We will always choose $\varphi$ so that $\hat\gamma_t \in L_1(\mathbb{R}) \cap L_2(\mathbb{R})$. Under this assumption
we may write

$\gamma_t(x) = \int_{-\infty}^{\infty} \hat\gamma_t(\omega)e^{-2\pi i\omega x}\,d\omega = -\int_{-\infty}^{\infty} (2\pi\omega)^2 e^{2\pi i\omega(t-x)}\,\frac{\hat\varphi(\omega h)}{\hat K(-\omega)}\,d\omega$

and

$\int_{-\infty}^{\infty} f(x)\psi_t(x)\,dx = \int_{-\infty}^{\infty} (Kf)(x)\gamma_t(x)\,dx.$

Based on these considerations, we define the estimator $\hat\ell_h(t)$ of $\ell_h(t)$ by

$\hat\ell_h(t) = \int_{-\infty}^{\infty} \gamma_t(x)\,dY(x), \quad t \in \mathbb{R}.$

Properties of estimation procedures that we develop are determined crucially
by (i) accuracy of the probe functional estimator $\hat\ell_h(t)$ and (ii) the
ability of the probe functional to detect the change-point. The former is
quantified by the next lemma, which we prove under the following assumption
on the smoothing function $\varphi$.

Assumption 1. The Fourier transform $\hat\varphi$ of $\varphi \in L_2(\mathbb{R})$ satisfies

$\int_{-\infty}^{\infty} |\omega|^{2\beta+6}|\hat\varphi(\omega)|^2\,d\omega < \infty;$

that is, $\varphi$ belongs to the Sobolev space with smoothness index $\beta + 3$.

Lemma 1. Let Assumption 1 and the left inequality in Assumption K
hold, and let $h > 0$. Then the Gaussian random process $Z_h(t) = \hat\ell_h(t) - \ell_h(t)$,
$t \in B$, where $B$ is a subinterval of $[0, 1]$, satisfies

(11) $E[Z_h(t)] = 0, \quad \sigma_Z^2 = \sup_{t \in B} E[Z_h^2(t)] \le C_1\varepsilon^2 h^{-2\beta-5}.$

Furthermore, for any $\lambda > 2\sigma_Z$

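(12) $P\Big\{\sup_{t \in B}|Z_h(t)| > \lambda\Big\} \le C_2\Big(\frac{\varepsilon h^{-\beta-7/2}}{\sigma_Z^2}\Big)|B|\,\lambda\exp\Big\{-C_3\frac{\lambda^2}{\sigma_Z^2}\Big\},$

where $|B|$ stands for the Lebesgue measure of the set $B$ and $C_i$, $i = 1, 2, 3$,
are positive constants.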

Further results will be obtained under the following condition which is
stronger than Assumption 1.

Assumption 2. The Fourier transform $\hat\varphi$ of $\varphi \in L_2(\mathbb{R})$ is an even, nonnegative,
infinitely differentiable function supported on $[-2/3, -1/3] \cup
[1/3, 2/3]$ and taking the value 1 on $[-2/3 + \eta, -1/3 - \eta] \cup [1/3 + \eta, 2/3 - \eta]$
for some $\eta \in (0, 1/32)$.

It follows from Assumption 2 that $\varphi$ is a real-valued, even, analytic function,
rapidly decreasing at infinity, together with all its derivatives. In addition,
since $\hat\varphi$ is nonnegative, $\varphi$ achieves its global maximum at $x = 0$,
$\varphi'(0) = 0$ and $|\varphi''(0)| \ge M > 0$ for some constant $M$. The Meyer wavelet
(see, e.g., [17], Section 7.2.2), centered at zero and rescaled accordingly, provides
an example of a function that satisfies Assumption 2.

We summarize some properties of $\varphi$ that will be repeatedly used in what
follows.

(I) We have $\varphi'(0) = 0$ and $\varphi'(x) = -\varphi'(-x)$ for all $x$.

(II) The function $\varphi'$ decreases in $[0, 3/8]$, increases in $[3/4, 9/8]$ and has
a unique minimum in $[3/8, 3/4]$, which is the point of the global minimum,
$x = q_* \in [3/8, 3/4]$. By (I), $\varphi'$ attains its global maximum at
$x = -q_* \in [-3/4, -3/8]$.

(III) There exists a unique zero of $\varphi'$ in the interval $[3/4, 3/2]$. We denote
it by $q_0$ and let $d = |q_0 - q_*| > 0$. We have that $\varphi' < 0$ on $(0, q_0)$ and

(13) $\inf_{x : |x - q_*| \ge d/2}\{\varphi'(x) - \varphi'(q_*)\} \ge r > 0$

for some constant $r$.

Proofs of (I)-(III) are immediate and based on analysis of the integrand
sign in the expressions for $\varphi'$ and $\varphi''$. Parameters $q_*$, $q_0$ and $d$ that appear
in (II) and (III) depend on the specific choice of $\varphi$; condition (13) asserts
that $q_*$ is a well-separated point of the global minimum of $\varphi'$.

The next lemma analyzes the separation between values of the probe
functional $\ell_h(t)$ when $t$ varies in a "punctured" neighborhood of the change-point
$\theta$.

Lemma 2 (Separation rate). Let Assumption 2 hold and let $\delta \in (0, qh)$,
where $q = q_* + 3d/4$, and the constants $q_*$ and $d$ are given in (II) and (III).

1. Let $f \in \mathcal{F}_m$ and let

(14) $\delta \ge C_1(L/a)h^{m+1}$

for an absolute constant $C_1 > 0$ large enough. Then for sufficiently small $h$,

(15) $\inf_{t : \delta \le |t - \theta| \le qh}\{|\ell_h(t)| - |\ell_h(\theta)|\} \ge C_2 a\delta h^{-3},$

where $C_2$ is a positive constant that depends only on $a$, $L$ and $\varphi$.

2. Let $f \in \mathcal{A}_\nu$ and let

(16) $\delta \ge C_3(L/a)h\exp\{-\nu/(3h)\}$

for an absolute constant $C_3 > 0$ large enough. Then (15) holds for sufficiently
small $h$.

The value that appears on the right-hand side (RHS) of (15) will be called
the $\delta$-separation rate that corresponds to the probe functional $\ell_h$. Lemma 2
asserts that $\theta$ is a well-separated point of minimum of $|\ell_h(t)|$, provided that
$h$ and $\delta$ satisfy (14) and (16) for $f \in \mathcal{F}_m$ and $f \in \mathcal{A}_\nu$, respectively. Conditions
(14) and (16) are required to guarantee that the bias terms do not
exceed the contrast expressed by the $\delta$-separation rate. They also show that
if $f \in \mathcal{A}_\nu$, the value $\delta$ can be chosen much smaller than in the case of $f \in \mathcal{F}_m$;
that is, the minimum of $|\ell_h(\cdot)|$ is more pronounced when $f \in \mathcal{A}_\nu$. It is interesting
to note that if a smoothed first derivative of $f$ is used as the probe
functional and the maximum is sought, the corresponding $\delta$-separation rate
is of order $\delta^2 h^{-3}$. As our proofs suggest, in the regular case this choice of
the probe functional does not lead to a rate optimal estimation procedure
(see Section 5).

4. Estimation procedure and main results. We are now in a position to
define the estimation procedure. The construction has two stages: first we
localize the region that contains the change-point with probability close to
1 and then we search for a minimum of the absolute value of the probe
functional inside the region.

The localization step is based on the following argument. As the proof
of Lemma 2 shows, $\ell_h(t)$ is equal to $-h^{-2}\varphi'(h^{-1}(\theta - t))[f](\theta)$ up to a term
that is negligible, provided that $|\theta - t| = O(h)$. It follows from (II) that

$t_* = \arg\min_{t \in [0,1]}\{-\varphi'((\theta - t)h^{-1})[f](\theta)\}$ and $t^* = \arg\max_{t \in [0,1]}\{-\varphi'((\theta - t)h^{-1})[f](\theta)\}$

are within the distance $O(h)$ from $\theta$:

$|t_* - \theta| = q_* h \in [3h/8, 3h/4], \quad |t^* - \theta| = q_* h \in [3h/8, 3h/4].$

In addition, $|t_* - t^*| = 2q_* h \in [3h/4, 3h/2]$. If $[f](\theta) < 0$, then $t^* > t_*$ and
$[t_*, t^*]$ contains $\theta$; if $[f](\theta) > 0$, then $t^* < t_*$ and $\theta \in [t^*, t_*]$. Both $t_*$ and $t^*$
can be estimated from the data. This fact will be used to find an interval of
size $O(h)$ that contains the change-point with probability close to 1.
Let

(17) $\hat t_* = \arg\min_{t \in [0,1]}\hat\ell_h(t), \quad \hat t^* = \arg\max_{t \in [0,1]}\hat\ell_h(t)$

and let $\hat A_h$ be the closed interval with endpoints $\hat t_*$ and $\hat t^*$. Our estimator
$\hat\theta_h$ of the change-point for given bandwidth $h$ is defined by

(18) $\hat\theta_h = \arg\min_{t \in \hat A_h}|\hat\ell_h(t)|.$

We note that this construction depends on the bandwidth $h$ that will be
chosen in an optimal way.
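
The two-stage construction (17)-(18) can be sketched on a grid. The example below is only an illustration under simplifying assumptions: it is noiseless, it uses the leading term $-h^{-2}\varphi'(h^{-1}(\theta - t))[f](\theta)$ in place of the estimated probe functional, and it takes the Gaussian bump $\varphi(x) = e^{-x^2/2}$ for convenience, which is even and maximal at 0 but does not satisfy Assumption 2. The names `probe_leading_term` and `two_stage_estimate` and the numerical values are hypothetical.

```python
import numpy as np


def probe_leading_term(t, theta, jump, h):
    """Leading term -h^{-2} * phi'((theta - t)/h) * jump, with phi(x) = exp(-x^2/2) (assumed)."""
    tau = (theta - t) / h
    dphi = -tau * np.exp(-0.5 * tau**2)   # phi'(tau) for the Gaussian bump
    return -jump * dphi / h**2


def two_stage_estimate(values, grid):
    """Stage 1: locate argmin and argmax of the probe values on the grid, as in (17).
    Stage 2: minimize the absolute value between these two points, as in (18)."""
    i_min = int(np.argmin(values))
    i_max = int(np.argmax(values))
    lo, hi = sorted((i_min, i_max))
    k = lo + int(np.argmin(np.abs(values[lo:hi + 1])))
    return grid[k]


# Illustrative setting (assumed values): change-point 0.37, jump 1.5, bandwidth 0.05.
grid = np.linspace(0.0, 1.0, 2001)
theta, jump, h = 0.37, 1.5, 0.05
values = probe_leading_term(grid, theta, jump, h)
theta_hat = two_stage_estimate(values, grid)

if __name__ == "__main__":
    print(theta_hat)
```

With real data the array `values` would be replaced by evaluations of $\hat\ell_h(t)$, and the accuracy of both stages would then depend on the noise level and on the choice of $h$ discussed in the theorems below.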

4.1. Functional class $\mathcal{F}_m$.

Theorem 1. Suppose that the left inequality in (6) is satisfied and that
Assumption 2 holds. Let $\hat\theta_*$ denote the change-point estimator $\hat\theta_h$ with the
bandwidth $h = h_*$ defined below.

1. Regular case: Assume that $\beta > 1/2$ and let

(19) $h_* = C_1^*(\varepsilon/L)^{2/(2m+2\beta+1)},$

where $C_1^* > 0$ is a constant. Then there exists a constant $C_2^* < \infty$ independent
of $a$, $L$ such that

$R_\varepsilon[\hat\theta_*; \mathcal{F}_m] \le C_2^* a^{-1}L^{(2\beta-1)/(2\beta+2m+1)}\varepsilon^{(2m+2)/(2m+2\beta+1)} \quad \forall\, 0 < \varepsilon < 1.$

2. Singular case: Assume that $0 < \beta \le 1/2$ and let

t 2/(2/3+1) 

(20) ft. 




where $C_3^* > 0$ is a sufficiently large constant. Then there exists a constant
$C_4^* < \infty$ such that

$R_\varepsilon[\hat\theta_*; \mathcal{F}_m] \le C_4^*\varepsilon^{2/(2\beta+1)}\Big(\ln\frac{1}{\varepsilon}\Big)^{(1-2\beta)/(2(2\beta+1))} \quad \forall\, 0 < \varepsilon < 1.$

Theorem 2. Let the right-hand side inequality in (6) hold and assume
that $|\hat K(\omega)| \ne 0$ for all $\omega$. Then, for sufficiently small $\varepsilon$, the minimax risk
over the class $\mathcal{F}_m$ is bounded from below as follows:

1. Regular case: If $\beta > 1/2$, then

(21) $R^*_\varepsilon[\mathcal{F}_m] \ge c_1^* a^{-1}L^{(2\beta-1)/(2\beta+2m+1)}\varepsilon^{(2m+2)/(2m+2\beta+1)},$

where $c_1^*$ does not depend on $a$, $L$.

2. Singular case: If $0 < \beta \le 1/2$, then

(22) $R^*_\varepsilon[\mathcal{F}_m] \ge c_2^*\varepsilon\big(\ln\frac{1}{\varepsilon}\big)^{-1/2}$ if $\beta = 1/2$, and $R^*_\varepsilon[\mathcal{F}_m] \ge c_3^*\varepsilon^{2/(2\beta+1)}$ if $0 < \beta < 1/2$,

where $c_2^*, c_3^* > 0$ are constants.

Theorems 1 and 2 show that the estimate $\hat\theta_*$ is rate optimal in the regular
case. Moreover, the bounds identify the precise dependence of the minimax
risk on the size of the jump $a$ and on the "Lipschitz constant" $L$ away
from the jump (see definition of the class $\mathcal{F}_m$). As smoothness of $f$ away
from the jump increases (i.e., as $m \to \infty$), the optimal rate of convergence
approaches the usual parametric rate $\varepsilon$. In the singular case, $\hat\theta_*$ is nearly
rate optimal up to a factor logarithmic in $\varepsilon^{-1}$. Here the order of the rate
of convergence is faster than the parametric rate $\varepsilon$, but slower than $\varepsilon^2$, the
minimax rate achieved in change-point estimation under direct observations.
It follows from Lemma 3 in Section 6 that the estimators $\hat t_*$ and $\hat t^*$ defined in
(17) and associated with the bandwidth (20) are also nearly rate optimal in
the singular case. We conjecture that for $0 < \beta < 1/2$ the extra logarithmic
factor in the bound of Theorem 1 can be removed and thus $\varepsilon^{2/(2\beta+1)}$ is the
optimal rate of convergence for such values of $\beta$.



4.2. Functional class $\mathcal{A}_\nu$.



Theorem 3. Suppose that the left-hand side inequality in (6) is satisfied
and that Assumption 2 holds. Let $\hat\theta_+$ denote the change-point estimator $\hat\theta_h$
with the bandwidth $h = h_+$ defined below.

1. Regular case: Assume that $\beta > 1/2$ and let
(23) h+ = v Jl n k.( (3+ l \JUn- 



Then there exists a constant $C_5^* < \infty$ independent of $\nu$, $a$, $L$ such that

$R_\varepsilon[\hat\theta_+; \mathcal{A}_\nu] \le C_5^* a^{-1}\varepsilon\Big(\frac{1}{\nu}\ln\frac{1}{\varepsilon}\Big)^{\beta-1/2} \quad \forall\, 0 < \varepsilon < 1.$

2. Singular case: Assume that $0 < \beta \le 1/2$ and let, for some sufficiently
large $C_6^* > 0$,

2/(2/3+1) 




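Then there exists a constant $C_7^* < \infty$ such that

$R_\varepsilon[\hat\theta_+; \mathcal{A}_\nu] \le C_7^*\varepsilon^{2/(2\beta+1)}\Big(\ln\frac{1}{\varepsilon}\Big)^{(1-2\beta)/(2(2\beta+1))} \quad \forall\, 0 < \varepsilon < 1.$

Theorem 4. Let the right-hand side inequality in (6) hold. Then for
sufficiently small $\varepsilon$ the minimax risk over the class $\mathcal{A}_\nu$ is bounded from
below as follows:

1. Regular case: If $\beta > 1/2$, then

(25) $R^*_\varepsilon[\mathcal{A}_\nu] \ge c_4^* a^{-1}\varepsilon\Big(\frac{1}{\nu}\ln\frac{1}{\varepsilon}\Big)^{\beta-1/2},$

where $c_4^*$ is a constant independent of $\nu$, $a$, $L$.

2. Singular case: If $0 < \beta \le 1/2$, then

(26) $R^*_\varepsilon[\mathcal{A}_\nu] \ge c_5^*\varepsilon\big(\ln\frac{1}{\varepsilon}\big)^{-1/2}$ if $\beta = 1/2$, and $R^*_\varepsilon[\mathcal{A}_\nu] \ge c_6^*\varepsilon^{2/(2\beta+1)}$ if $0 < \beta < 1/2$,

where $c_5^*, c_6^* > 0$ are constants independent of $\nu$.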

Theorems 3 and 4 indicate that $\hat\theta_+$ is rate optimal in the regular case,
and nearly rate optimal in the singular case. It is interesting to note that in
the regular case, when $f \in \mathcal{A}_\nu$, almost parametric rates of convergence are
attained by our estimation procedure. The ill-posedness of the convolution
operator $K$, as expressed by index $\beta$, does not have a significant effect on
the rates of convergence when $f \in \mathcal{A}_\nu$; this fact is rather surprising.

5. Discussion. 1. Our technique elucidates how the construction of the
probe functional affects estimation accuracy. An appropriate probe functional,
$\ell_h$, should satisfy the following two requirements: (i) $\theta$ is a well-separated
point of minimum (maximum) of $\ell_h(\cdot)$ or $|\ell_h(\cdot)|$ and (ii) $\ell_h(t)$ admits
a "good" estimator with "small" bias and variance. The proofs suggest that
in the regular case the optimal rates of convergence are obtained by balancing
three quantities: the $\delta$-separation rate, the bias and the stochastic
error of estimation of a properly chosen probe functional. In the singular
case the bias is asymptotically negligible and the optimal rates are obtained
by balancing only two terms: the $\delta$-separation rate and the stochastic error.
As an illustration, consider the regular case and estimation
on classes $\mathcal{F}_m$. For our functional $\ell_h$, the $\delta$-separation rate in (15) is of
order $\delta h^{-3}$, the stochastic error is characterized by the square root of the
variance $\mathrm{Var}[\hat\ell_h(t)] = O(\varepsilon^2 h^{-2\beta-5})$ [see (11)] and the bias term is of order
$h^{m-2}$ [see (30) and (32)]. The balance between the three terms is given
by the relationships $\delta h^{-3} \asymp h^{m-2} \asymp \varepsilon h^{-\beta-5/2}$. Solving for this, we get the
optimal bandwidth $h \asymp \varepsilon^{2/(2m+2\beta+1)}$ and the corresponding optimal rate
$\delta \asymp \varepsilon^{(2m+2)/(2m+2\beta+1)}$.

2. The proposed estimator is based on a local search for a zero of the
smoothed second derivative of $f$. An alternative and seemingly natural approach
would be to estimate $\theta$ by searching for a maximum of a smoothed
first derivative of $f$, that is, to consider the probe functional $w_h(t) =
h^{-2}\int f(x)\varphi'(h^{-1}(x - t))\,dx$. This, however, does not lead to rate optimal estimation
in the regular case. Although the stochastic error of the corresponding
estimator $\hat w_h(t)$ is smaller than that of $\hat\ell_h(t)$ [$\mathrm{Var}[\hat w_h(t)] = O(\varepsilon^2 h^{-2\beta-3})$],
the $\delta$-separation rate is of order $\delta^2 h^{-3}$. The bias term is now of order $h^{m-1}$;
this follows from similar arguments as in the proof of Lemma 2. By balancing
the three terms (bias, stochastic error and $\delta$-separation rate), as
explained in the previous remark in this section, it is not difficult to verify
that the estimator based on a local maximization of $|\hat w_h(t)|$ has risk of order
$\varepsilon^{(m+2)/(2m+2\beta+1)}$ when $f \in \mathcal{F}_m$ and $\beta > 1/2$. We recall that the optimal rate
of convergence given by Theorem 1 is faster, $\varepsilon^{(2m+2)/(2m+2\beta+1)}$.
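
The balancing arguments of remarks 1 and 2 can be reproduced with exact arithmetic on exponents. The sketch below assumes the orders stated in those remarks: for the second-derivative probe, $\delta h^{-3} \asymp h^{m-2} \asymp \varepsilon h^{-\beta-5/2}$; for the first-derivative probe, $\delta^2 h^{-3} \asymp h^{m-1} \asymp \varepsilon h^{-\beta-3/2}$. In both cases bias and stochastic error balance at $h \asymp \varepsilon^{2/(2m+2\beta+1)}$. The function name `balanced_rates` is a hypothetical helper.

```python
from fractions import Fraction


def balanced_rates(m, beta, probe="second"):
    """Return (bandwidth exponent, rate exponent) as powers of eps, regular case.

    probe="second": delta * h^-3 ~ h^(m-2) ~ eps * h^(-beta-5/2)  =>  delta ~ h^(m+1)
    probe="first":  delta^2 * h^-3 ~ h^(m-1) ~ eps * h^(-beta-3/2) =>  delta ~ h^((m+2)/2)
    """
    m, beta = Fraction(m), Fraction(beta)
    h_exp = Fraction(2) / (2 * m + 2 * beta + 1)   # h ~ eps^{2/(2m+2beta+1)} in both cases
    if probe == "second":
        delta_exp = (m + 1) * h_exp
    elif probe == "first":
        delta_exp = (m + 2) / 2 * h_exp
    else:
        raise ValueError("probe must be 'second' or 'first'")
    return h_exp, delta_exp


if __name__ == "__main__":
    for m, beta in [(1, 1), (2, 1), (4, 3)]:
        print(m, beta, balanced_rates(m, beta, "second"), balanced_rates(m, beta, "first"))
```

For every $m > 0$ the exponent $(2m+2)/(2m+2\beta+1)$ exceeds $(m+2)/(2m+2\beta+1)$, so the second-derivative probe gives the faster rate as $\varepsilon \to 0$.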

3. The results of the present paper cover the problem of estimating change-points
in the $\beta$th derivative of a function from direct observations. In particular,
assume that $\beta$ is an integer and let $K$ be the Green's function of
a linear differential operator of order $\beta$ as defined in (7). Denoting $q = Kf$,
we note that estimating the change-point in $f$ from indirect observations
(1) is equivalent to estimating the change-point in $q^{(\beta)}$ from direct observation
of $q$ in the white noise model. Indeed, in view of the inversion formula
(8), if $f$ has a change-point at $\theta$, then $q^{(\beta)}$ (or any linear differential form
of $q$ of order $\beta$) has a change-point at $\theta$ as well. Therefore, regarding the
observations of $q$ in the white noise model as indirect observations of $f$,
with $K$ being the Green's function of a linear differential operator, we can
apply our procedure to estimation of change-points in $q^{(\beta)}$. In particular,
according to Theorems 1 and 2, if $q^{(\beta)}$ satisfies the Lipschitz condition away
from the change-point (i.e., $m = 1$), the best achievable rate of convergence
is $\varepsilon^{4/(2\beta+3)}$, which can be easily extended to the regression problem with
equidistant design, where the rate becomes $n^{-2/(2\beta+3)}$. This indicates that
change-point estimation procedures in [21], as well as in [13, 20, 26], are not
rate optimal, contrary to what is claimed in some of these papers. Specifically,
as a referee pointed out, the lower bound of Raimondo [21] is not
correct. At the same time, our results are consistent with those obtained
for density deconvolution by Neumann [19], who only considered the case of
Lipschitz smoothness ($m = 1$).

6. Proofs and auxiliary results. In what follows $C$, $c$, $C_i$ and $c_i$, $i =
1, 2, \ldots$, stand for positive constants that may differ on different occurrences.

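Proof of Lemma 1. Assumptions 1 and K imply that $\gamma_t \in L_2(\mathbb{R})$.
Since $f \in L_2(\mathbb{R})$ and $K \in L_1(\mathbb{R})$ we have that $Kf \in L_2(\mathbb{R})$; thus the Fourier
transform $\widehat{Kf}$ exists and $\widehat{Kf} = \hat K\hat f$. Using (1) and Plancherel's formula we
get, for any $t \in B$,

$E_f[\hat\ell_h(t)] = \int_{-\infty}^{\infty}\gamma_t(x)(Kf)(x)\,dx$

$= \int_{-\infty}^{\infty}\overline{\hat\gamma_t(\omega)}\,\hat K(\omega)\hat f(\omega)\,d\omega$

$= -\int_{-\infty}^{\infty}(2\pi\omega)^2 e^{-2\pi i\omega t}\hat\varphi(\omega h)\hat f(\omega)\,d\omega$

$= \int_{-\infty}^{\infty}\overline{\hat\psi_t(\omega)}\,\hat f(\omega)\,d\omega = \langle f, \psi_t\rangle = \ell_h(t),$

which proves that $E[Z_h(t)] = 0$. Thus $Z_h(t)$ is a zero mean Gaussian random
variable with variance

$E[Z_h^2(t)] = E\Big|\varepsilon\int_{-\infty}^{\infty}\gamma_t(x)\,dW(x)\Big|^2 = \varepsilon^2\int_{-\infty}^{\infty}|\hat\gamma_t(\omega)|^2\,d\omega$

$= \varepsilon^2\int_{-\infty}^{\infty}\frac{|2\pi\omega|^4|\hat\varphi(\omega h)|^2}{|\hat K(-\omega)|^2}\,d\omega$

$\le c_1\frac{\varepsilon^2}{h^{2\beta+5}}\int_{-\infty}^{\infty}|\hat\varphi(\omega)|^2(1 + |\omega|^2)^{\beta+2}\,d\omega$

$\le c_2\varepsilon^2 h^{-2\beta-5},$

where we have used Assumptions K and 1. This proves (11).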

To prove (12) we apply the inequality on the tails of Gaussian processes (see, e.g., [24], Proposition A.2.7). For this purpose, using Plancherel's formula and Assumptions K and 1 we obtain, for $t, s \in [0, 1]$,

$$
\begin{aligned}
\sigma^2(t,s) &= E|Z_h(t) - Z_h(s)|^2 = \varepsilon^2\int_{-\infty}^{\infty} |\gamma_t(x) - \gamma_s(x)|^2\,dx \\
&= \varepsilon^2\int_{-\infty}^{\infty} \frac{|2\pi\omega|^4\,|\hat{\varphi}(\omega h)|^2\,|e^{2\pi i\omega t} - e^{2\pi i\omega s}|^2}{|\hat{K}(-\omega)|^2}\,d\omega \\
&\le c_3\varepsilon^2|t-s|^2\int_{-\infty}^{\infty} |\omega|^6(1+|\omega|^2)^{\beta}\,|\hat{\varphi}(\omega h)|^2\,d\omega \\
&\le c_3\varepsilon^2 h^{-2\beta-7}|t-s|^2\int_{-\infty}^{\infty} (1+|\omega|^2)^{\beta+3}\,|\hat{\varphi}(\omega)|^2\,d\omega \\
&\le c_4\varepsilon^2 h^{-2\beta-7}|t-s|^2 .
\end{aligned}
$$

Therefore, the number of balls of radius $r$ in the semimetric $\sigma(t,s)$ that cover the interval $B \subseteq [0,1]$ does not exceed $c_5 r^{-1}\varepsilon h^{-\beta-7/2}|B|$, and applying Proposition A.2.7 of [24] (putting in the notation of that proposition $\varepsilon_0 = \sigma_Z$, $K = \varepsilon h^{-\beta-7/2}|B|$), we get the lemma. □
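The $h^{-2\beta-5}$ scaling of the variance bound can be checked numerically. The sketch below is an illustration only, under assumptions that are not taken from the paper: `phi_hat` is a hypothetical smooth bump supported on $1/3 \le |u| \le 2/3$, and $|\hat{K}(-\omega)|^{-2}$ is replaced by the model weight $(1+\omega^2)^{\beta}$. It approximates $\int |2\pi\omega|^4 |\hat{\varphi}(\omega h)|^2 (1+\omega^2)^{\beta}\,d\omega$ by a Riemann sum and multiplies the result by $h^{2\beta+5}$; the products should remain bounded as $h$ decreases.

```python
import numpy as np

def phi_hat(u):
    """Hypothetical smooth bump supported on 1/3 <= |u| <= 2/3 (an assumption)."""
    u = np.abs(np.atleast_1d(np.asarray(u, dtype=float)))
    out = np.zeros_like(u)
    inside = (u > 1.0 / 3.0) & (u < 2.0 / 3.0)
    s = (u[inside] - 0.5) * 6.0                      # maps (1/3, 2/3) onto (-1, 1)
    out[inside] = np.exp(-1.0 / np.maximum(1.0 - s**2, 1e-12))
    return out

def variance_integral(h, beta, n=200_001):
    """Riemann sum for int |2 pi w|^4 |phi_hat(w h)|^2 (1 + w^2)^beta dw."""
    w = np.linspace(-1.0 / h, 1.0 / h, n)            # integrand vanishes for |w| >= 2/(3h)
    integrand = (2.0 * np.pi * w) ** 4 * phi_hat(w * h) ** 2 * (1.0 + w**2) ** beta
    return float(np.sum(integrand) * (w[1] - w[0]))

def scaled_integrals(beta, hs=(0.2, 0.1, 0.05, 0.025)):
    """Integral times h^(2 beta + 5) for each bandwidth h in hs."""
    return [variance_integral(h, beta) * h ** (2.0 * beta + 5.0) for h in hs]
```

For instance, `scaled_integrals(1.0)` returns four values of comparable size, in line with a bound of the form $c_2\varepsilon^2 h^{-2\beta-5}$.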

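Proof of Lemma 2. 1. Fix $t$ satisfying $\delta < |t - \theta| < qh$ and define $\tau = (\theta - t)/h$; clearly $q > |\tau| > \delta/h$. By (9),

$$
\ell_h(t) = \frac{1}{h^2}\int_{-\infty}^{\tau} \varphi''(x) f(t + xh)\,dx + \frac{1}{h^2}\int_{\tau}^{\infty} \varphi''(x) f(t + xh)\,dx. \tag{27}
$$

First assume that $m = 1$. Let

$$
J_1(t) = \frac{1}{h^2}\int_{-\infty}^{\tau} \varphi''(x)\,[f(t + xh) - f(\theta-)]\,dx + \frac{1}{h^2}\int_{\tau}^{\infty} \varphi''(x)\,[f(t + xh) - f(\theta+)]\,dx.
$$

Then using Definition 1 and the fact that $\varphi'(-\infty) = \varphi'(\infty) = 0$, we obtain

$$
\ell_h(t) = \frac{1}{h^2}\varphi'(\tau) f(\theta-) - \frac{1}{h^2}\varphi'(\tau) f(\theta+) + J_1(t) = -\frac{1}{h^2}\varphi'(\tau)\,[f](\theta) + J_1(t). \tag{28}
$$

Recall that $\varphi'(0) = 0$ and $|\varphi''(0)| > M > 0$. In addition, by (I)-(III), $\varphi'(x)$ has a unique zero in the interval $[-q, q]$ at $x = 0$. Therefore, $|\varphi'(\tau)| \ge c_1|\tau|$ for all $|\tau| \in (\delta/h, q)$ and, for $h$ small enough, we get

$$
\frac{1}{h^2}\,|[f](\theta)|\,|\varphi'(\tau)| \ge c_2 a h^{-2}|\tau| \ge c_3 a\delta h^{-3}. \tag{29}
$$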

Furthermore, we note that $\ell_h(\theta) = J_1(\theta)$ and that, for all $t$ satisfying $\delta < |t - \theta| < qh$,

$$
|J_1(t)| \le \frac{L}{h}\int_{-\infty}^{\infty} |\varphi''(x)|\,|\tau - x|\,dx \le c_4 L h^{-1}. \tag{30}
$$

Using (14) we conclude that for sufficiently small $h$, the sign of $\ell_h(t)$ is determined by the first term in (28). Therefore, (15) holds for $m = 1$.

If $m > 1$, then integrating by parts in both integrals on the RHS of (27) and using the fact that $\varphi'(-\infty) = \varphi'(\infty) = 0$, we obtain

$$
\ell_h(t) = -\frac{1}{h^2}\varphi'(\tau)\,[f](\theta) - J_2(t) \tag{31}
$$

with

$$
J_2(t) := \frac{1}{h}\int_{-\infty}^{\infty} \varphi'(x)\, g_f(t + xh)\,dx
= \frac{1}{h^2}\int_{-\infty}^{\infty} (-2\pi i\omega)\,\hat{\varphi}(\omega)\, e^{2\pi i\omega t/h}\,\hat{g}_f(-\omega/h)\,d\omega,
$$

where $g_f$ is defined in (3) and the last equality follows from the Plancherel formula. In view of Assumption 2 and Definition 2,

$$
|J_2(t)| \le \frac{2\pi}{h^2}\max_{1/3 \le |\omega| \le 2/3}\frac{|\hat{\varphi}(\omega)|}{|\omega|^{m-2}}\int_{-\infty}^{\infty} |\omega|^{m-1}\,|\hat{g}_f(\omega/h)|\,d\omega \le c_5 L h^{m-2}. \tag{32}
$$

This along with (14), (29), (31) and the fact that $\ell_h(\theta) = -J_2(\theta)$ completes the proof of the first statement of the lemma.

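2. If $f \in \mathcal{A}_\nu$, then $\ell_h(t)$ is again given by (31) and $J_2(t)$ is now bounded using the Cauchy-Schwarz inequality:

$$
\begin{aligned}
|J_2(t)| &\le 4\pi^2\Big\{\int_{-\infty}^{\infty} |\omega|^2\,|\hat{\varphi}(\omega h)|^2 \exp\{-2\nu|\omega|\}\,d\omega\Big\}^{1/2}\Big\{\int_{-\infty}^{\infty} |\hat{g}_f(\omega)|^2 \exp\{2\nu|\omega|\}\,d\omega\Big\}^{1/2} \\
&\le c_6 L\Big\{\int_{1/(3h) \le |\omega| \le 2/(3h)} |\omega|^2 \exp\{-2\nu|\omega|\}\,d\omega\Big\}^{1/2} \\
&\le c_7 L h^{-2}\exp\{-\nu/(3h)\}.
\end{aligned}
$$

The same considerations as above complete the proof. □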

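Lemma 3. Let Assumption 1 and the left inequality in Assumption K hold, and let $h^{\beta+1/2}\varepsilon^{-1} \ge C_1$. Then, for $f \in \mathcal{F}_m$ or $f \in \mathcal{A}_\nu$ and for all $h$ small enough,

$$
\max\big\{P_f\{|\hat{t}_* - t_*| > hd/2\},\; P_f\{|\hat{t}^* - t^*| > hd/2\}\big\} \le C_2 h^{\beta-1/2}\varepsilon^{-1}\exp\{-C_3 h^{2\beta+1}\varepsilon^{-2}\},
$$

where $d > 0$ is given in (III).

Proof. We will derive the inequality for $P_f\{|\hat{t}_* - t_*| > hd/2\}$ only; the proof of the other part is identical in every detail. Define $A = \{t \in [0,1] : |t - t_*| > hd/2\}$. By definition of $t_*$ and $\hat{t}_*$, we have

$$
\begin{aligned}
P_f\{|\hat{t}_* - t_*| > hd/2\} &\le P_f\{\exists\, t \in A : \hat{\ell}_h(t_*) \ge \hat{\ell}_h(t)\} \\
&= P_f\{\exists\, t \in A : [\hat{\ell}_h(t_*) - \ell_h(t_*)] + [\ell_h(t) - \hat{\ell}_h(t)] \ge \ell_h(t) - \ell_h(t_*)\} \\
&\le P_f\Big\{2\sup_{t \in [0,1]} |\hat{\ell}_h(t) - \ell_h(t)| \ge \inf_{t \in A}\big(\ell_h(t) - \ell_h(t_*)\big)\Big\}.
\end{aligned}
$$

It follows from (28) and (31) that

$$
\ell_h(t) - \ell_h(t_*) = -\frac{[f](\theta)}{h^2}\Big\{\varphi'\Big(\frac{\theta - t}{h}\Big) - \varphi'\Big(\frac{\theta - t_*}{h}\Big)\Big\} + J(t) - J(t_*),
$$

where $J(t) := J_1(t)$ if $f \in \mathcal{F}_1$ and $J(t) := -J_2(t)$ if $f \in \mathcal{F}_m$, $m > 1$, or $f \in \mathcal{A}_\nu$, with $J_1(t)$ and $J_2(t)$ as defined in the proof of Lemma 2. Therefore, we obtain

$$
\begin{aligned}
\inf_{t \in A}\big(\ell_h(t) - \ell_h(t_*)\big) &\ge \frac{|[f](\theta)|}{h^2}\inf_{t \in A}\Big|\varphi'\Big(\frac{\theta - t}{h}\Big) - \varphi'\Big(\frac{\theta - t_*}{h}\Big)\Big| - 2\sup_{t}|J(t)| \\
&\ge \frac{a}{h^2}\inf_{x : |x - q_*| > d/2}\{\varphi'(x) - \varphi'(q_*)\} - 2\sup_{t}|J(t)|,
\end{aligned}
\tag{33}
$$

where the last inequality follows by change of variables, by definition of $t_*$ and because $\varphi'(x) = -\varphi'(-x)$. Using property (III), we obtain that the first term on the RHS of (33) is at least $c_1 a h^{-2}$, while the second one does not exceed in absolute value $c_2 L h^{m-2}$ if $f \in \mathcal{F}_m$ and $c_3 L h^{-2}\exp\{-\nu/(3h)\}$ if $f \in \mathcal{A}_\nu$ (see the proof of Lemma 2, where upper bounds on $|J(t)|$ were established). Noting that $h^{-2}$ dominates both $h^{m-2}$ and $h^{-2}\exp\{-\nu/(3h)\}$ as $h$ tends to zero and applying Lemma 1, we obtain that

$$
P_f\{|\hat{t}_* - t_*| > hd/2\} \le P_f\Big\{\sup_{t \in [0,1]} |\hat{\ell}_h(t) - \ell_h(t)| \ge c_4 h^{-2}\Big\}
\le c_5 h^{\beta-1/2}\varepsilon^{-1}\exp\Big\{-c_6\frac{h^{2\beta+1}}{\varepsilon^2}\Big\},
$$

as claimed. □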



Proof of Theorem 1. 1. We begin with the regular case. The choice of $h = h_*$ in (19) implies that $\varepsilon^{-1}h^{\beta+1/2} \asymp \varepsilon^{-m/(m+\beta+1/2)} \to \infty$, so Lemma 3 can be applied. Let $\Omega$ be the event that $|\hat{t}_* - t_*| \le hd/2$ and $|\hat{t}^* - t^*| \le hd/2$. Recall that $|t_* - \theta| = |t^* - \theta| = q_* h$, where $q_*$ is defined in (II). Therefore, on the set $\Omega$,

$$
|\hat{t}_* - \theta| \le q_* h + hd/2 < qh \quad\text{and}\quad |\hat{t}^* - \theta| \le q_* h + hd/2 < qh, \tag{34}
$$

where $q$ is defined in Lemma 2. Recall that, by property (III), $\varphi'(x)$ has a unique zero in the interval $[-q, q]$ at the point $x = 0$. This guarantees that, if $\Omega$ holds, the set used in definition (18) contains a unique zero of the function $t \mapsto \varphi'(h^{-1}(\theta - t))$ at $t = \theta$ and thus the definition (18) is justified.
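We write

$$
\begin{aligned}
E_f|\hat{\theta}_h - \theta|^2 &= E_f\{|\hat{\theta}_h - \theta|^2 \mathbf{1}(\Omega)\} + E_f\{|\hat{\theta}_h - \theta|^2 \mathbf{1}(\Omega^c)\} \\
&\le E_f\{|\hat{\theta}_h - \theta|^2 \mathbf{1}(\Omega)\} + P_f(\Omega^c),
\end{aligned}
\tag{35}
$$

where $\mathbf{1}(\cdot)$ denotes the indicator function. By Lemma 3 we have

$$
\begin{aligned}
P_f(\Omega^c) &= P_f\{(|\hat{t}_* - t_*| > hd/2) \cup (|\hat{t}^* - t^*| > hd/2)\} \\
&\le c_1 h^{\beta-1/2}\varepsilon^{-1}\exp\{-c_2 h^{2\beta+1}\varepsilon^{-2}\} \\
&\le c_3\varepsilon^{-(m+1)/(m+\beta+1/2)}\exp\{-c_4\varepsilon^{-2m/(m+\beta+1/2)}\}.
\end{aligned}
\tag{36}
$$

Furthermore, when $\Omega$ holds, it follows from (34) and from the construction of $\hat{\theta}_h$ that $|\hat{\theta}_h - \theta| \le qh$. Thus for any $\delta \in (0, qh)$, the first term on the RHS of (35) can be bounded as

$$
E_f\{|\hat{\theta}_h - \theta|^2 \mathbf{1}(\Omega)\} \le \delta^2 + \sum_{j=1}^{J} \delta^2 2^{2j}\, P_f\big(\{|\hat{\theta}_h - \theta| \in A_j\} \cap \Omega\big), \tag{37}
$$

where $A_j := [\delta 2^{j-1}, \delta 2^{j}]$ and $J = \min\{j : \delta 2^{j} > qh\}$. Let $T_j = \{t : |t - \theta| \in A_j\}$; we note that $|T_j| = \delta 2^{j}$. Then we have

$$
\begin{aligned}
P_f\big(\{|\hat{\theta}_h - \theta| \in A_j\} \cap \Omega\big) &\le P_f\big(\exists\, t \in T_j : |\hat{\ell}_h(\theta)| \ge |\hat{\ell}_h(t)|\big) \\
&\le P_f\Big\{2\sup_{t} |\hat{\ell}_h(t) - \ell_h(t)| \ge \inf_{t \in T_j}\big(|\ell_h(t)| - |\ell_h(\theta)|\big)\Big\}.
\end{aligned}
\tag{38}
$$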

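We first estimate $\inf_{t \in T_j}(|\ell_h(t)| - |\ell_h(\theta)|)$ using Lemma 2. Note that Lemma 2 can be applied with $\{t : \delta < |t - \theta| < qh\}$ replaced by $T_j$ for each $j = 1, \ldots, J$, provided that $\delta 2^{j-1} \ge c_5 (L/a) h^{m+1}$. In particular, we have

$$
\inf_{t \in T_j}\big(|\ell_h(t)| - |\ell_h(\theta)|\big) \ge \inf_{t : \delta 2^{j-1} \le |t - \theta| \le qh}\big(|\ell_h(t)| - |\ell_h(\theta)|\big) \ge c_6 a\delta h^{-3} 2^{j}, \qquad j = 1, \ldots, J. \tag{39}
$$

Let

$$
\delta = c_7 (L/a) h^{m+1} = c_8 L^{(2\beta-1)/(2\beta+2m+1)}\, a^{-1}\, (\varepsilon^2)^{(m+1)/(2\beta+2m+1)}. \tag{40}
$$

It is straightforward to verify that with this choice of $\delta$ and $h$, for sufficiently large $c_7$, conditions of Lemma 2 are satisfied. In addition, $2^{j} a\delta h^{-3} \ge c_9 \varepsilon h^{-\beta-5/2}$ for some constant $c_9$ and each $j = 1, \ldots, J$. Therefore, using (15) and applying Lemma 1, we obtain

$$
\begin{aligned}
P_f\Big\{\sup_{t \in T_j} |Z_h(t)| > c_6 a\delta h^{-3} 2^{j}\Big\}
&\le c_{10}\,\frac{h^{\beta+3/2}}{\varepsilon}\,|T_j|\, a\delta h^{-3} 2^{j}\exp\Big\{-c_{11}\frac{a^2\delta^2 2^{2j} h^{2\beta-1}}{\varepsilon^2}\Big\} \\
&\le c_{12}\,\frac{h^{\beta+3/2}}{\varepsilon}\, a\delta^2 2^{2j} h^{-3}\exp\Big\{-c_{13}\, 2^{2j}\,\frac{\delta^2 h^{2\beta-1}}{\varepsilon^2}\Big\} \\
&\le c_{12}\,\frac{h^{\beta-3/2}}{\varepsilon}\, a\delta^2 2^{2j}\exp\{-c_{14}\, 2^{2j}\},
\end{aligned}
\tag{41}
$$

where we have taken into account that $\delta^2 h^{2\beta-1}\varepsilon^{-2} \ge c > 0$ under (19) and (40). Note also that $c_{14}$ can be made large enough by choice of $c_8$ in (40). Furthermore, since $\delta^2 h^{\beta-3/2}\varepsilon^{-1} \asymp h^{\beta+2m+1/2}\varepsilon^{-1} \asymp \varepsilon^{m/(\beta+m+1/2)} = o(1)$ as $\varepsilon \to 0$, we finally obtain from (41), (38) and (37) that

$$
E_f\{|\hat{\theta}_h - \theta|^2 \mathbf{1}(\Omega)\} \le \delta^2 + \delta^2\, o(1)\sum_{j=1}^{J} 2^{4j}\exp\{-c_{14}\, 2^{2j}\} \le \delta^2 (1 + o(1)).
$$

Combining this with (35) and (36), we complete the proof of the first part of the theorem.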
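The exponent bookkeeping behind the choice of $\delta$ and $h$ in the regular case can be verified with exact rational arithmetic. The following is a minimal sketch, not part of the paper: the function names are hypothetical, and it assumes $h \asymp \varepsilon^{1/(m+\beta+1/2)}$ (which is what the relation $\varepsilon^{-1}h^{\beta+1/2} \asymp \varepsilon^{-m/(m+\beta+1/2)}$ implies) together with $\delta \asymp h^{m+1}$ as in (40), with the constants $L$ and $a$ ignored.

```python
from fractions import Fraction

def regular_case_exponents(m, beta):
    """Exponents of eps in h and in delta (regular case, constants ignored)."""
    m, beta = Fraction(m), Fraction(beta)
    h_exp = 1 / (m + beta + Fraction(1, 2))   # assumption: h ~ eps^h_exp
    delta_exp = (m + 1) * h_exp               # delta ~ h^(m+1), as in (40)
    return h_exp, delta_exp

def balance_exponent(m, beta):
    """Exponent of eps in delta^2 h^(2 beta - 1) eps^(-2); zero means order one."""
    h_exp, delta_exp = regular_case_exponents(m, beta)
    return 2 * delta_exp + (2 * Fraction(beta) - 1) * h_exp - 2

def prefactor_exponent(m, beta):
    """Exponent of eps in delta^2 h^(beta - 3/2) eps^(-1); positive means o(1)."""
    h_exp, delta_exp = regular_case_exponents(m, beta)
    return 2 * delta_exp + (Fraction(beta) - Fraction(3, 2)) * h_exp - 1
```

For $m = 1$ the second value returned by `regular_case_exponents(1, beta)` equals $4/(2\beta+3)$; `balance_exponent` is identically zero and `prefactor_exponent` equals $m/(m+\beta+1/2)$, matching the two facts used after (41).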

2. For the singular case, the proof follows the same lines with minor modifications; we indicate them below.

The choice of $h$ in (20) ensures that $h^{\beta+1/2}\varepsilon^{-1} = c_{15}\sqrt{\ln \varepsilon^{-1}}$ so that Lemma 3 applies. In addition, by choice of $c_{15}$ large enough, $P_f(\Omega^c) = o(h^{2+\eta})$ for any $\eta > 0$ and $\varepsilon \to 0$. Arguing as in the proof of the first part, we see that inequalities (37)-(39) hold. For some constant $c_{16} > 0$, let

$$
\delta = c_{16}\,\varepsilon h^{-\beta+1/2} = c_{17}\,\varepsilon^{1/(\beta+1/2)}\Big(\sqrt{\ln\frac{1}{\varepsilon}}\Big)^{(-\beta+1/2)/(\beta+1/2)}. \tag{42}
$$

Under this choice,

$$
\inf_{t \in T_j}\big(|\ell_h(t)| - |\ell_h(\theta)|\big) \ge c_{18}\,\delta h^{-3} 2^{j} \ge c_{19}\,\varepsilon h^{-\beta-5/2}\, 2^{j}, \qquad j = 1, \ldots, J.
$$

The last inequality ensures that Lemma 1 can be applied and, similarly to (41), we have

$$
\begin{aligned}
P_f\Big\{\sup_{t \in T_j} |Z_h(t)| > c_{18}\,\delta h^{-3} 2^{j}\Big\}
&\le c_{20}\,\frac{h^{\beta-3/2}}{\varepsilon}\,\delta^2 2^{2j}\exp\Big\{-c_{21}\, 2^{2j}\,\frac{\delta^2 h^{2\beta-1}}{\varepsilon^2}\Big\} \\
&\le c_{22}\,\frac{\delta}{h}\, 2^{2j}\exp\{-c_{23}\, 2^{2j}\}.
\end{aligned}
$$

Substituting expression (20) for $h$ and summing up over $j = 1, \ldots, J$, we complete the proof. □

Proof of Theorem 2. We use the method of proving minimax lower bounds based on a reduction to the problem of testing two simple hypotheses; see, for example, [16], Chapter 2, or [23], Chapter 2.

Pick $f_0 \in \mathcal{F}_m(a, L/2)$ such that $f_0$ has a unique jump discontinuity at $\theta_0 = 0$ and $[f_0](0) = a$. Fix $\delta \in (0, 1]$ and define $v(x) = a\mathbf{1}_{[0,\delta]}(x)$, $x \in \mathbb{R}$. The Fourier transform of $v$ is given by

$$
\hat{v}(\omega) = a\int_0^{\delta} e^{2\pi i\omega x}\,dx = \frac{a}{2\pi i\omega}\,[e^{2\pi i\omega\delta} - 1].
$$

Fix $N > 0$ and define

$$
v_N(x) = \int_{-N}^{N} \hat{v}(\omega)\, e^{-2\pi i\omega x}\,d\omega, \qquad x \in \mathbb{R}.
$$

The Fourier transform of this function is $\hat{v}_N(\omega) = \hat{v}(\omega)\mathbf{1}(|\omega| \le N)$. Let

$$
f_1(x) = f_0(x) - [v(x) - v_N(x)], \qquad x \in \mathbb{R}.
$$

The jump of $v$ at $x = 0$ cancels that of $f_0$, so the function $x \mapsto f_0(x) - v(x)$ has a unique jump at $x = \delta$ with $[f_0 - v](\delta) = a$, while $v_N$ is infinitely differentiable. Therefore, $x \mapsto f_1(x)$ has a unique jump at $x = \delta$ with $[f_1](\delta) = a$. Set $\theta_1 = \delta$, where the index 1 indicates that $\theta_1$ is the change-point of $f_1$.

We now show that $f_1 \in \mathcal{F}_m(a, L)$ under appropriate choice of $N$. First, clearly $f_1 \in L_2(\mathbb{R})$, since $f_0, v, v_N \in L_2(\mathbb{R})$. Next, the Fourier transform of the derivative $v_N'(x)$ is given by $(\mathcal{F}v_N')(\omega) = (-2\pi i\omega)\hat{v}_N(\omega) = (-2\pi i\omega)\hat{v}(\omega)\mathbf{1}(|\omega| \le N)$ and

$$
\begin{aligned}
\int_{-\infty}^{\infty} |(\mathcal{F}v_N')(\omega)|\,|\omega|^{m-1}\,d\omega &= 2\pi\int_{-\infty}^{\infty} |\hat{v}_N(\omega)|\,|\omega|^{m}\,d\omega \\
&= a\int_{-N}^{N} |e^{2\pi i\omega\delta} - 1|\,|\omega|^{m-1}\,d\omega \le \frac{4\pi a\delta}{m+1}\,N^{m+1}.
\end{aligned}
\tag{43}
$$

In what follows choose $N = \big(\frac{L(m+1)}{8\pi a\delta}\big)^{1/(m+1)}$. Then the expression in (43) is less than $L/2$.

First let $m = 1$. Then (43) implies that the derivative $|v_N'(x)|$ is bounded by $L/2$ uniformly in $x \in \mathbb{R}$ and thus $v_N$ is Lipschitz continuous with Lipschitz constant $L/2$ on $\mathbb{R}$. Also, $f_0 - v$ has this property apart from $\theta_1 = \delta$. Hence, $f_1$ is Lipschitz continuous with Lipschitz constant $L$ apart from $\theta_1 = \delta$, which proves that $f_1 \in \mathcal{F}_1(a, L)$.

Now let $m > 1$. Then $g_{f_1} = g_{f_0} + v_N'$, with $g_f$ defined in (3) and

$$
\int_{-\infty}^{\infty} |\hat{g}_{f_0}(\omega)|\,|\omega|^{m-1}\,d\omega \le L/2,
$$

since $f_0 \in \mathcal{F}_m(a, L/2)$. This and (43) prove that, under our choice of $N$,

$$
\int_{-\infty}^{\infty} |\hat{g}_{f_1}(\omega)|\,|\omega|^{m-1}\,d\omega \le L.
$$

We have thus shown that $f_1 \in \mathcal{F}_m(a, L)$.

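For brevity, let $P_0$ and $P_1$ denote the probability measures associated with the observations $Y = \{Y(x) : x \in \mathbb{R}\}$ in model (1) with $f = f_0$ and $f = f_1$, respectively. In view of Girsanov's formula, the Kullback-Leibler divergence between $P_0$ and $P_1$ has the form

$$
\mathcal{K}(P_0, P_1) := \int \ln\frac{dP_0}{dP_1}\,dP_0 = \frac{1}{2\varepsilon^2}\,\|K(f_0 - f_1)\|_2^2 .
$$

The function $\Delta = f_0 - f_1 = v - v_N$ belongs to $L_2(\mathbb{R})$ and its Fourier transform is given by $\hat{v}(\omega)\mathbf{1}(|\omega| > N)$. Since $K \in L_1(\mathbb{R})$, $K\Delta$ exists and $\widehat{K\Delta} = \hat{K}\hat{\Delta}$. Hence, by Plancherel's formula,

$$
\mathcal{K}(P_0, P_1) = \frac{1}{2\varepsilon^2}\int_{|\omega| > N} |\hat{K}(\omega)|^2\,|\hat{v}(\omega)|^2\,d\omega. \tag{44}
$$

Assume first that $\beta > 1/2$. Then

$$
\mathcal{K}(P_0, P_1) \le c_1\varepsilon^{-2} a^2\delta^2 N^{-2\beta+1} = c_2\varepsilon^{-2} L^{(1-2\beta)/(m+1)}\,(a\delta)^{(2\beta+2m+1)/(m+1)},
$$

where we have used Assumption K and the fact that $|\hat{v}(\omega)| \le a\delta$ for all $\omega \in \mathbb{R}$. Choosing

$$
\delta = c\, a^{-1} L^{(2\beta-1)/(2\beta+2m+1)}\,\varepsilon^{2(m+1)/(2\beta+2m+1)},
$$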

we ensure that $\mathcal{K}(P_0, P_1) \le \alpha < \infty$ for $\varepsilon$ small enough. On the other hand, $|\theta_0 - \theta_1| = \delta$ and it follows from part (iii) of Theorem 2.2 in [23] that $\sup_{f \in \mathcal{F}_m} E_f|\hat{\theta} - \theta|^2 \ge c_3\delta^2$. This completes the proof of (21) for $\beta > 1/2$.

Now let $0 < \beta \le 1/2$. We decompose the domain of integration in (44) into two parts: $N < |\omega| \le N'$ and $|\omega| > N'$, where $N' = 1/\delta$. For $N < |\omega| \le N'$ we bound the integrand as above, while for $|\omega| > N'$ we use the fact that $|\hat{v}(\omega)| \le a(\pi|\omega|)^{-1}$. This yields

$$
\mathcal{K}(P_0, P_1) \le c_4\Big(\frac{\delta^2}{\varepsilon^2}\int_{N < |\omega| \le N'} \frac{d\omega}{|\omega|^{2\beta}} + \frac{1}{\varepsilon^2}\int_{|\omega| > N'} \frac{d\omega}{|\omega|^{2+2\beta}}\Big). \tag{45}
$$

Hence, for $\beta = 1/2$ we obtain

$$
\mathcal{K}(P_0, P_1) \le c_5\Big(\frac{\delta^2}{\varepsilon^2}\ln N' + \frac{1}{\varepsilon^2 (N')^{2}}\Big),
$$

and the choice of $\delta \asymp \varepsilon(\ln(1/\varepsilon))^{-1/2}$ allows us to conclude the proof using the same argument as in the case of $\beta > 1/2$. Finally, for $0 < \beta < 1/2$, we get from (45) that

$$
\mathcal{K}(P_0, P_1) \le c_6\Big(\frac{\delta^2 (N')^{1-2\beta}}{\varepsilon^2} + \frac{1}{\varepsilon^2 (N')^{2\beta+1}}\Big),
$$

and the choice of $\delta \asymp \varepsilon^{2/(2\beta+1)}$ yields the boundedness of the last expression and hence the desired result. □

The proofs of Theorems 3 and 4 follow the same steps as the proofs of Theorems 1 and 2 with slight modifications and are omitted.
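The closed form of $\hat{v}$ and the two bounds on $|\hat{v}(\omega)|$ used in the proof of Theorem 2 can be checked numerically. The sketch below is an illustration only; the function names and the sample values of $a$, $\delta$ and $\omega$ are hypothetical choices, not quantities from the paper. It compares the closed form $a(e^{2\pi i\omega\delta} - 1)/(2\pi i\omega)$ with a trapezoidal approximation of $a\int_0^{\delta} e^{2\pi i\omega x}\,dx$.

```python
import numpy as np

def v_hat(omega, a, delta):
    """Closed form a (e^{2 pi i omega delta} - 1) / (2 pi i omega); equals a*delta at omega = 0."""
    omega = np.atleast_1d(np.asarray(omega, dtype=float))
    out = np.full(omega.shape, a * delta, dtype=complex)
    nz = omega != 0
    out[nz] = a * (np.exp(2j * np.pi * omega[nz] * delta) - 1) / (2j * np.pi * omega[nz])
    return out

def v_hat_numeric(omega, a, delta, n=20_001):
    """Trapezoidal approximation of a * int_0^delta e^{2 pi i omega x} dx for scalar omega."""
    x = np.linspace(0.0, delta, n)
    vals = a * np.exp(2j * np.pi * omega * x)
    dx = x[1] - x[0]
    return complex(dx * (vals.sum() - 0.5 * (vals[0] + vals[-1])))

def check_bounds(omegas, a, delta):
    """True if |v_hat| <= a*delta and |v_hat| <= a/(pi |omega|) at all nonzero omegas."""
    omegas = np.asarray(omegas, dtype=float)
    mod = np.abs(v_hat(omegas, a, delta))
    ok_small = np.all(mod <= a * delta + 1e-12)
    ok_tail = np.all(mod <= a / (np.pi * np.abs(omegas)) + 1e-12)
    return bool(ok_small and ok_tail)
```

Both bounds follow from $|e^{i\vartheta} - 1| \le \min(|\vartheta|, 2)$: the first is used on $N < |\omega| \le N'$ and the second on $|\omega| > N'$.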

REFERENCES 

[1] Antoniadis, A. and Gijbels, I. (2002). Detecting abrupt changes by wavelet meth- 
ods. J. Nonparametr. Statist. 14 7-29. MR1905582 

[2] Belitser, E. and Levit, B. (2001). Asymptotically local minimax estimation of 
infinitely smooth density with censored data. Ann. Inst. Statist. Math. 53 289- 
306. MR1841137 

[3] Brown, L. D. and Low, M. G. (1996). Asymptotic equivalence of nonparametric 
regression and white noise. Ann. Statist. 24 2384-2398. MR1425958 

[4] Carlstein, E., Müller, H.-G. and Siegmund, D., eds. (1994). Change-Point Problems. IMS, Hayward, CA. MR1477909

[5] Carroll, R. J. and Hall, P. (1988). Optimal rates of convergence for deconvolving 
a density. J. Amer. Statist. Assoc. 83 1184-1186. MR0997599 

[6] Cavalier, L. and Tsybakov, A. B. (2002). Sharp adaptation for inverse problems 
with random noise. Probab. Theory Related Fields 123 323-354. MR1918537 

[7] Fan, J. (1991). On the optimal rates of convergence for nonparametric deconvolution 
problems. Ann. Statist. 19 1257-1272. MR1126324 

[8] Fan, J. and Koo, J.-Y. (2002). Wavelet deconvolution. IEEE Trans. Inform. Theory 
48 734-747. MR1889978 

[9] Gijbels, I., Hall, P. and Kneip, A. (1999). On the estimation of jump points in 
smooth curves. Ann. Inst. Statist. Math. 51 231-251. MR1707773 
[10] Goldenshluger, A. (1999). On pointwise adaptive nonparametric deconvolution. 
Bernoulli 5 907-925. MR1715444 






[11] Gonzalez, R. C. and Woods, R. E. (1992). Digital Image Processing. Addison-Wesley, Reading, MA.

[12] Hirschman, I. I. and Widder, D. V. (1955). The Convolution Transform. Princeton Univ. Press. MR0073746

[13] Huh, J. and Carriere, K. C. (2002). Estimation of regression functions with a discontinuity in a derivative with local polynomial fits. Statist. Probab. Lett. 56 329-343. MR1892994

[14] Ibragimov, I. A. and Hasminskii, R. Z. (1983). Estimation of distribution density. J. Soviet Math. 25 40-57.

[15] Korostelev, A. P. (1987). Minimax estimation of a discontinuous signal. Theory Probab. Appl. 32 727-730. MR0927265

[16] Korostelev, A. P. and Tsybakov, A. B. (1993). Minimax Theory of Image Reconstruction. Lecture Notes in Statist. 82. Springer, New York. MR1226450

[17] Mallat, S. (1998). A Wavelet Tour of Signal Processing. Academic Press, San Diego, CA. MR1614527

[18] Müller, H.-G. (1992). Change-points in nonparametric regression analysis. Ann. Statist. 20 737-761. MR1165590

[19] Neumann, M. H. (1997). Optimal change-point estimation in inverse problems. Scand. J. Statist. 24 503-521. MR1615339

[20] Park, C.-W. and Kim, W.-C. (2004). Estimation of a regression function with a sharp change point using boundary wavelets. Statist. Probab. Lett. 66 435-448. MR2045137

[21] Raimondo, M. (1998). Minimax estimation of sharp change points. Ann. Statist. 26 
1379-1397. MR1647673 

[22] Spokoiny, V. (1998). Estimation of a function with discontinuities via local polyno- 
mial fit with an adaptive window choice. Ann. Statist. 26 1356-1378. MR1647669 

[23] Tsybakov, A. B. (2004). Introduction à l'estimation non-paramétrique. Springer, Berlin. MR2013911

[24] van der Vaart, A. and Wellner, J. (1996). Weak Convergence and Empirical Processes. With Applications to Statistics. Springer, New York. MR1385671

[25] Wang, Y. (1995). Jump and sharp cusp detection by wavelets. Biometrika 82 385- 
397. MR1354236 

[26] Wang, Y. (1999). Change-points via wavelets for indirect data. Statist. Sinica 9 
103-117. MR1678883 

[27] Yin, Y. Q. (1988). Detection of the number, locations and magnitudes of jumps. 
Comm. Statist. Stochastic Models 4 445-455. MR0971600 

A. Goldenshluger
Department of Statistics
University of Haifa
Haifa 31905
Israel
E-mail: goldensh@stat.haifa.ac.il

A. Tsybakov
Laboratoire de Probabilités et Modèles Aléatoires
Université Paris VI
4 Place Jussieu
Paris 75252
France
E-mail: tsybakov@ccr.jussieu.fr

A. Zeevi
Graduate School of Business
Columbia University
3022 Broadway
New York 10027
USA
E-mail: assaf@gsb.columbia.edu



